AITopics | two-step approach

two-step approach

Optimizing affinity-based binary hashing using auxiliary coordinates

Ramin Raziperchikolaei, Miguel A. Carreira-Perpinan

Neural Information Processing Systems | Apr-22-2026, 03:42:58 GMT

In supervised binary hashing, one wants to learn a function that maps a high-dimensional feature vector to a vector of binary codes, for application to fast image retrieval. This typically results in a difficult optimization problem, nonconvex and nonsmooth, because of the discrete variables involved. Much work has simply relaxed the problem during training, solving a continuous optimization, and truncating the codes a posteriori. This gives reasonable results but is quite suboptimal. Recent work has tried to optimize the objective directly over the binary codes and achieved better results, but the hash function was still learned a posteriori, which remains suboptimal. We propose a general framework for learning hash functions using affinity-based loss functions that uses auxiliary coordinates. This closes the loop and optimizes jointly over the hash functions and the binary codes so that they gradually match each other. The resulting algorithm can be seen as an iterated version of the procedure of optimizing first over the codes and then learning the hash function. Compared to this, our optimization is guaranteed to obtain better hash functions while being not much slower, as demonstrated experimentally in various supervised datasets.
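
As a rough illustration of what a binary hash function and code-based retrieval look like, here is a minimal sketch. It is not the authors' algorithm: the projection matrix `W` is random rather than learned, the data are synthetic, and the names `hash_codes` and `hamming_rank` are hypothetical.

```python
import numpy as np

def hash_codes(X, W):
    """Map feature vectors (rows of X) to binary codes by thresholding a linear projection at zero."""
    return (X @ W > 0).astype(np.uint8)

def hamming_rank(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query, plus the distances."""
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    return np.argsort(distances, kind="stable"), distances

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))  # 100 synthetic items with 32-dimensional features
W = rng.normal(size=(32, 8))    # random 8-bit projection (assumption: a real method would learn W)
codes = hash_codes(X, W)
order, distances = hamming_rank(codes[0], codes)  # item 0 used as the query
```

Thresholding a continuous projection corresponds to the truncation step the abstract describes; the paper's contribution is in how the codes and the hash function are optimized, which this sketch does not attempt.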

artificial intelligence, hash function, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Review for NeurIPS paper: A new inference approach for training shallow and deep generalized linear models of noisy interacting neurons

Neural Information Processing Systems | Jan-23-2025, 09:34:08 GMT

Additional Feedback: - The authors claim that empirically they do not need large amounts of repeated stimuli for the method to work. This empirical claim is based on only a single experimental dataset. It would be nice to see some theoretical analysis or exploration into how much data is needed for this to work -- presumably if my data has only 2 repeats of a stimulus then the h_stim auxiliary variable could be very poorly estimated. This introduces a bias into the results of the model, but how bad is this bias? Is this correction procedure provably optimal in some way?

deep generalized linear model, neurips paper, new inference approach, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Exploring Large Language Models for Product Attribute Value Identification

Sabeh, Kassem, Kacimi, Mouna, Gamper, Johann, Litschko, Robert, Plank, Barbara

arXiv.org Artificial Intelligence | Sep-19-2024

Product attribute value identification (PAVI) involves automatically identifying attributes and their values from product information, enabling features like product search, recommendation, and comparison. Existing methods primarily rely on fine-tuning pre-trained language models, such as BART and T5, which require extensive task-specific training data and struggle to generalize to new attributes. This paper explores large language models (LLMs), such as LLaMA and Mistral, as data-efficient and robust alternatives for PAVI. We propose various strategies: comparing one-step and two-step prompt-based approaches in zero-shot settings and utilizing parametric and non-parametric knowledge through in-context learning examples. We also introduce a dense demonstration retriever based on a pre-trained T5 model and perform instruction fine-tuning to explicitly train LLMs on task-specific instructions. Extensive experiments on two product benchmarks show that our two-step approach significantly improves performance in zero-shot settings, and instruction fine-tuning further boosts performance when using training data, demonstrating the practical benefits of using LLMs for PAVI.

dataset, extraction, product title, (16 more...)

arXiv.org Artificial Intelligence

2409.12695

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Multilingual De-Duplication Strategies: Applying scalable similarity search with monolingual & multilingual embedding models

Pasch, Stefan, Petridis, Dimitirios, Cutura, Jannic

arXiv.org Artificial Intelligence | Jun-19-2024

This paper addresses the deduplication of multilingual textual data using advanced NLP tools. We compare a two-step method involving translation to English followed by embedding with mpnet, and a multilingual embedding model (distiluse). The two-step approach achieved a higher F1 score (82% vs. 60%), particularly for less widely used languages, and this score can be raised to 89% by leveraging expert rules based on domain knowledge. We also highlight limitations related to token length constraints and computational efficiency. Our methodology suggests improvements for future multilingual deduplication tasks.
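
The comparison step shared by both methods, flagging pairs of texts whose embeddings are close, can be sketched as follows. This is an illustrative sketch only: the vectors are hand-written stand-ins for real embeddings, the 0.9 threshold and the name `find_duplicates` are assumptions, and the all-pairs comparison is a brute-force substitute for a scalable similarity search.

```python
import numpy as np

def find_duplicates(embeddings, threshold=0.9):
    """Return index pairs (i, j), i < j, whose cosine similarity is at least `threshold`."""
    E = np.asarray(embeddings, dtype=float)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)  # unit-normalize each row
    sims = E @ E.T                                    # cosine similarity matrix
    i_idx, j_idx = np.triu_indices(len(E), k=1)       # each unordered pair once
    mask = sims[i_idx, j_idx] >= threshold
    return [(int(i), int(j)) for i, j in zip(i_idx[mask], j_idx[mask])]

# Toy vectors standing in for embeddings of three texts (assumption, not real model output).
toy = [[1.0, 0.0, 0.0], [0.99, 0.05, 0.0], [0.0, 1.0, 0.0]]
pairs = find_duplicates(toy)
```

In the two-step method described above, the embeddings would be computed after translating each text to English; in the one-step alternative they would come directly from a multilingual model.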

dataset, duplicate, multilingual, (16 more...)

arXiv.org Artificial Intelligence

2406.13695

Country: Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.05)

Genre: Research Report (0.82)

Industry: Government (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Learning and Optimization for Price-based Demand Response of Electric Vehicle Charging

Gu, Chengyang, Pan, Yuxin, Liu, Ruohong, Chen, Yize

arXiv.org Artificial Intelligence | Apr-16-2024

In the context of charging electric vehicles (EVs), the price-based demand response (PBDR) is becoming increasingly significant for charging load management. Such a response usually encourages cost-sensitive customers to adjust their energy demand in response to changes in price for financial incentives. Thus, to model and optimize EV charging, it is important for the charging station operator to model the PBDR patterns of EV customers by precisely predicting charging demands given price signals. Then the operator refers to these demands to optimize the charging station's power allocation policy. The standard pipeline involves offline fitting of a PBDR function based on historical EV charging records, followed by applying estimated EV demands in downstream charging station operation optimization. In this work, we propose a new decision-focused end-to-end framework for PBDR modeling that combines prediction errors and downstream optimization cost errors in the model learning stage. We evaluate the effectiveness of our method on a simulation of charging station operation with synthetic PBDR patterns of EV customers, and experimental results demonstrate that this framework can provide a more reliable prediction model for the ultimate optimization process, leading to more effective optimization solutions in terms of cost savings and charging station operation objectives with only a few training samples.

customer, demand response, ev customer, (14 more...)

arXiv.org Artificial Intelligence

2404.10311

Country:

Asia > China > Hong Kong (0.04)
North America > United States > California (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Modeling & Simulation (0.69)

Two-step Automated Cybercrime Coded Word Detection using Multi-level Representation Learning

Kim, Yongyeon, On, Byung-Won, Lee, Ingyu

arXiv.org Artificial Intelligence | Mar-16-2024

In social network service platforms, crime suspects are likely to use cybercrime coded words for communication by adding criminal meanings to existing words or replacing them with similar words. For instance, the word 'ice' is often used to mean methamphetamine in drug crimes. To analyze the nature of cybercrime and the behavior of criminals, quickly detecting such words and further understanding their meaning are critical. In the automated cybercrime coded word detection problem, it is difficult to collect a sufficient amount of training data for supervised learning and to directly apply language models that utilize context information to better understand natural language. To overcome these limitations, we propose a new two-step approach, in which a mean latent vector is constructed for each cybercrime through one of five different AutoEncoder models in the first step, and cybercrime coded words are detected based on multi-level latent representations in the second step. Moreover, to deeply understand cybercrime coded words detected through the two-step approach, we propose three novel methods: (1) detection of recently coined words, (2) detection of words that frequently appear in both drug and sex crimes, and (3) automatic generation of a word taxonomy. According to our experimental results, among various AutoEncoder models, the stacked AutoEncoder model shows the best performance. Additionally, the F1-score of the two-step approach is 0.991, which is higher than the 0.987 and 0.903 of the existing dark-GloVe and dark-BERT models. By analyzing the experimental results of the three proposed methods, we can gain a deeper understanding of drug and sex crimes.

crime, cybercrime, sex crime, (15 more...)

arXiv.org Artificial Intelligence

2403.10838

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Optimizing Affinity-Based Binary Hashing Using Auxiliary Coordinates

Neural Information Processing Systems | Mar-12-2024, 17:27:49 GMT

artificial intelligence, hash function, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Merced County > Merced (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction

Timans, Alexander, Straehle, Christoph-Nikolas, Sakmann, Kaspar, Nalisnick, Eric

arXiv.org Machine Learning | Mar-11-2024

Quantifying a model's predictive uncertainty is essential for safety-critical applications such as autonomous driving. We consider quantifying such uncertainty for multi-object detection. In particular, we leverage conformal prediction to obtain uncertainty intervals with guaranteed coverage for object bounding boxes. One challenge in doing so is that bounding box predictions are conditioned on the object's class label. Thus, we develop a novel two-step conformal approach that propagates uncertainty in predicted class labels into the uncertainty intervals for the bounding boxes. This broadens the validity of our conformal coverage guarantees to include incorrectly classified objects, ensuring their usefulness when maximal safety assurances are required. Moreover, we investigate novel ensemble and quantile regression formulations to ensure the bounding box intervals are adaptive to object size, leading to a more balanced coverage across sizes. Validating our two-step approach on real-world datasets for 2D bounding box localization, we find that desired coverage levels are satisfied with actionably tight predictive uncertainty intervals.
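
For readers unfamiliar with conformal prediction, the sketch below shows the standard split conformal recipe for a single scalar coordinate, using absolute residuals as nonconformity scores. It is not the paper's two-step procedure: it ignores class labels entirely, the calibration data are synthetic, and the names `conformal_quantile` and `interval` are hypothetical.

```python
import math
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration nonconformity scores."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the calibration score to use
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return float(scores[k - 1])

def interval(prediction, q):
    """Symmetric prediction interval around a point prediction."""
    return prediction - q, prediction + q

rng = np.random.default_rng(0)
true = rng.normal(size=200)                    # synthetic calibration targets
pred = true + rng.normal(scale=0.5, size=200)  # synthetic model predictions
q = conformal_quantile(np.abs(true - pred), alpha=0.1)
lo, hi = interval(3.0, q)                      # interval for a new prediction of 3.0
```

Under exchangeability of calibration and test points, an interval built this way covers the true value with probability at least 1 - alpha; the abstract's contribution is extending such guarantees to boxes whose class label may itself be wrong.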

calibration, class label, prediction, (15 more...)

arXiv.org Machine Learning

2403.07263

Country:

Asia > Middle East > Jordan (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (1.00)

Industry: Transportation > Ground > Road (0.48)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.88)

Discrete-time Competing-Risks Regression with or without Penalization

Meir, Tomer, Gorfine, Malka

arXiv.org Machine Learning | Nov-14-2023

Many studies employ the analysis of time-to-event data that incorporates competing risks and right censoring. Most methods and software packages are geared towards analyzing data that comes from a continuous failure time distribution. However, failure-time data may sometimes be discrete either because time is inherently discrete or due to imprecise measurement. This paper introduces a novel estimation procedure for discrete-time survival analysis with competing events. The proposed approach offers two key advantages over existing procedures: first, it expedites the estimation process for a large number of unique failure time points; second, it allows for straightforward integration and application of widely used regularized regression and screening methods. We illustrate the benefits of our proposed approach by conducting a comprehensive simulation study. Additionally, we showcase the utility of our procedure by estimating a survival model for the length of stay of patients hospitalized in the intensive care unit, considering three competing events: discharge to home, transfer to another medical facility, and in-hospital death.
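
To make the discrete-time competing-risks setting concrete, the following sketch computes overall survival and cumulative incidence functions from given cause-specific hazards, using the standard definitions S(t) = prod_{k<=t} (1 - sum_j h_j(k)) and CIF_j(t) = sum_{k<=t} h_j(k) S(k-1). It does not implement the paper's estimation procedure; the hazard values are made up and the name `cumulative_incidence` is hypothetical.

```python
import numpy as np

def cumulative_incidence(hazards):
    """hazards[t, j] = P(T = t+1, J = j | T >= t+1). Returns (survival, cif)."""
    hazards = np.asarray(hazards, dtype=float)
    overall = hazards.sum(axis=1)                           # hazard of any event at each time
    survival = np.cumprod(1.0 - overall)                    # S(t)
    survival_prev = np.concatenate(([1.0], survival[:-1]))  # S(t-1), with S(0) = 1
    cif = np.cumsum(hazards * survival_prev[:, None], axis=0)
    return survival, cif

# Two time points, two competing events, constant made-up hazards.
hazards = [[0.1, 0.2], [0.1, 0.2]]
survival, cif = cumulative_incidence(hazards)
```

At every time point the cumulative incidences of all events and the survival probability sum to one, which is a quick sanity check on any fitted model of this kind.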

artificial intelligence, machine learning, november 16, (19 more...)

arXiv.org Machine Learning

2303.01186

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > China (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

90% F1 Score in Relational Triple Extraction: Is it Real ?

Saini, Pratik, Pal, Samiran, Nayak, Tapas, Bhattacharya, Indrajit

arXiv.org Artificial Intelligence | Oct-27-2023

Extracting relational triples from text is a crucial task for constructing knowledge bases. Recent advancements in joint entity and relation extraction models have demonstrated remarkable F1 scores (≥ 90%) in accurately extracting relational triples from free text. However, these models have been evaluated under restrictive experimental settings and unrealistic datasets. They overlook sentences with zero triples (zero-cardinality), thereby simplifying the task. In this paper, we present a benchmark study of state-of-the-art joint entity and relation extraction models under a more realistic setting. We include sentences that lack any triples in our experiments, providing a comprehensive evaluation. Our findings reveal a significant decline (approximately 10-15% in one dataset and 6-14% in another dataset) in the models' F1 scores within this realistic experimental setup. Furthermore, we propose a two-step modeling approach that utilizes a simple BERT-based classifier. This approach leads to overall performance improvement in these models within the realistic experimental setting.
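
The control flow of such a two-step pipeline, where a classifier first screens out zero-triple sentences and the extractor runs only on the rest, might look like the sketch below. The classifier and extractor here are toy keyword-based stand-ins (assumptions for illustration); a real system would plug in a BERT-based classifier and a joint extraction model, and the abstract does not specify how the two are wired together.

```python
def two_step_extract(sentences, has_triples, extract):
    """Run `extract` only on sentences the classifier flags; others get an empty triple list."""
    return [extract(s) if has_triples(s) else [] for s in sentences]

# Hypothetical stand-ins for the two models.
def toy_classifier(sentence):
    """Flag a sentence as containing a triple if it uses the word 'founded'."""
    return " founded " in f" {sentence} "

def toy_extractor(sentence):
    """Extract a single (subject, relation, object) triple around 'founded'."""
    subject, obj = sentence.rstrip(".").split(" founded ")
    return [(subject, "founded", obj)]

results = two_step_extract(
    ["Ada founded Acme.", "It rained all day."],
    toy_classifier,
    toy_extractor,
)
```

Screening first means the extractor never has the chance to hallucinate triples on sentences that contain none, which is the failure mode the zero-cardinality evaluation exposes.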

dataset, extraction, relation, (15 more...)

arXiv.org Artificial Intelligence

2302.09887

Country: Asia > India (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
